On glottal source shape parameter transformation using a novel deterministic and stochastic speech analysis and synthesis system
Abstract
In this paper we present a flexible deterministic plus stochastic model (DSM) approach for high-quality parametric speech analysis and synthesis. The novelty of the proposed speech processing system lies in its extended means to estimate the unvoiced stochastic component and to robustly handle the transformation of the glottal excitation source. It is therefore well suited as a speech system for Voice Transformation and Voice Conversion. The system is evaluated on a voice quality transformation of natural human speech. The voice quality of a speech phrase is altered by resynthesizing the deterministic component with different pulse shapes of the glottal excitation source. A subjective listening test suggests that the system can synthesize speech that evokes in a listener the perceptual sensation of different voice quality characteristics. Additionally, improvements in speech synthesis quality over a baseline method are demonstrated.
Similar resources
Glottal source and vocal-tract separation: Estimation of glottal parameters, voice transformation and synthesis using a glottal model
This study addresses the problem of inverting a voice production model to retrieve, for a given recording, a representation of the sound source which is generated at the glottis level, the glottal source, and a representation of the resonances and anti-resonances of the vocal-tract. This separation gives the possibility to manipulate independently the elements composing the voice. There are man...
Voice quality transformation using an extended source-filter speech model
In this paper we present a flexible framework for parametric speech analysis and synthesis with high quality. It constitutes an extended source-filter model. The novelty of the proposed speech processing system lies in its extended means to use a Deterministic plus Stochastic Model (DSM) for the estimation of the unvoiced stochastic component from a speech recording. Further contributions are t...
Shape parameter estimate for a glottal model without time position
From a recorded speech signal, we propose to estimate a shape parameter of a glottal model without estimating its time position. Indeed, the literature usually proposes to estimate the time position first (e.g., by detecting Glottal Closure Instants). The vocal-tract filter estimate is expressed as a minimum-phase envelope estimation after removing the glottal model and a standard lips radiation m...
Glottal Closure Instant detection from a glottal shape estimate
Glottal Closure Instant (GCI) detection is a common problem in voice analysis, used for voice transformation and synthesis. The proposed idea is to use a glottal shape estimate and a standard lips radiation model instead of the common pre-emphasis when computing the vocal-tract filter estimate. The time-derivative glottal source is then computed from the division in frequency of the speech spectrum by the ...
Semi Parametric Concatenative TTS with Instant Voice Modification Capabilities
Recently, a glottal vocoder has been integrated in the IBM concatenative TTS system and certain configurable global voice transformations were defined in the vocoder parameter space. The vocoder analysis employs a novel robust glottal source parameter estimation strategy. The vocoder is applied to the voiced speech only, while unvoiced speech is kept unparameterized, thus contributing to the pe...